usage: clc_novo_assemble [options]
Assemble reads de novo and output the contig sequences in FASTA format.
Options:
-h / --help: Display this message
-q / --reads: The files following this option are read files. (may be used
several times)
-i <file1> <file2> / --interleave <file1> <file2>: Interleave the sequences
in two files, alternating between the files when reading the
sequences. Only valid for read files. (may be used several times)
-o <file> / --output <file>: Give the output fasta file (required)
-m <n> / --min-length <n>: Set the minimum contig length to output (default =
200)
-w <n> / --wordsize <n>: Set the word size for the de Bruijn graph (default
is automatic based on input data size)
-v / --verbose: Output various information while running.
-p <par> / --paired <par>: Set the paired read mode for the read files
following this option. (may be used several times)
par consists of four strings: <mode> <dist_mode> <min_dist> <max_dist>
mode is ff, fb, bf, bb and sets the relative orientation of read one and
two in a pair (f = forward, b = backward)
dist_mode is ss, se, es, ee and sets the place on read one and two to
measure the distance (s = start, e = end)
A typical use would be "-p fb ss 180 250" which means that the reads are
inverted and pointing towards each other. The distance includes both the
reads and the sequence between them. The distance may be between 180 and
250, both included.
It is also allowed to insert a "d" before the mode. This indicates that
the reads in the following file(s) should only be used for their paired end
information and not to build initial contigs. E.g. "-p d fb ss 180 250".
To explicitly say that the following reads are not paired, use "no" for
par, i.e. "-p no".
For paired end reads split in two files, use the -i option.
--cpus <n>: Set the number of CPUs to use.
--no-progress: Disable progress bar.
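The rules given above for the -p parameter can be checked before starting a long run. The following is a minimal sketch, not part of the CLC tools: the function name parse_paired and the returned dictionary keys are hypothetical, and rejecting min_dist > max_dist is an assumption, since the help text does not say how the program treats that case.

```python
# Illustrative helper, not part of the CLC tools. It checks the tokens that
# follow -p against the rules stated in the option description.
MODES = {"ff", "fb", "bf", "bb"}        # relative orientation of read one and two
DIST_MODES = {"ss", "se", "es", "ee"}   # where on each read the distance is measured


def parse_paired(par):
    """Validate a -p parameter string and return its parts as a dict."""
    tokens = par.split()
    if tokens == ["no"]:
        # "-p no" explicitly marks the following reads as unpaired.
        return {"paired": False}

    # An optional leading "d" means the reads are used only for their
    # paired end information, not to build initial contigs.
    distance_only = bool(tokens) and tokens[0] == "d"
    if distance_only:
        tokens = tokens[1:]

    if len(tokens) != 4:
        raise ValueError("expected <mode> <dist_mode> <min_dist> <max_dist>")
    mode, dist_mode, min_dist, max_dist = tokens

    if mode not in MODES:
        raise ValueError(f"mode must be one of {sorted(MODES)}, got {mode!r}")
    if dist_mode not in DIST_MODES:
        raise ValueError(
            f"dist_mode must be one of {sorted(DIST_MODES)}, got {dist_mode!r}"
        )

    low, high = int(min_dist), int(max_dist)  # int() raises ValueError on non-numbers
    if low > high:
        # Assumption: the help text does not define this case.
        raise ValueError("min_dist must not exceed max_dist")

    return {
        "paired": True,
        "distance_only": distance_only,
        "mode": mode,
        "dist_mode": dist_mode,
        "min_dist": low,
        "max_dist": high,
    }
```

For instance, parse_paired("d fb ss 180 250") reports distance_only as True, while parse_paired("fb xx 180 250") raises ValueError because "xx" is not a valid dist_mode.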
Examples:
De novo assembly of a single file with reads:
clc_novo_assemble -o contigs.fasta -q reads.fasta
De novo assembly of two interleaved files with paired end reads:
clc_novo_assemble -o contigs.fasta -p fb ss 180 250 -q -i reads1.fq reads2.fq
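When the program is driven from a script, the same command lines can be assembled as an argument list. The sketch below is one possible way to do that in Python; build_command and its parameter names are hypothetical and not part of the CLC tools. It places -p before the read files because the option applies to the files that follow it. Putting -m and --cpus before the read files is an assumption made for readability.

```python
import shlex


def build_command(output, reads=(), interleaved=(), paired=None,
                  min_length=None, cpus=None):
    """Return an argument list for clc_novo_assemble (illustrative only).

    reads       -- plain read files, listed after -q
    interleaved -- (file1, file2) pairs, each passed with -i
    paired      -- the -p parameter as one string, e.g. "fb ss 180 250"
    """
    argv = ["clc_novo_assemble", "-o", output]
    if min_length is not None:
        argv += ["-m", str(min_length)]
    if cpus is not None:
        argv += ["--cpus", str(cpus)]
    if paired:
        # -p affects the read files that follow it, so it comes before -q.
        argv += ["-p", *paired.split()]
    if reads or interleaved:
        argv.append("-q")
    argv += list(reads)
    for first, second in interleaved:
        argv += ["-i", first, second]
    return argv


# Rebuild the paired end example shown above.
argv = build_command(
    "contigs.fasta",
    interleaved=[("reads1.fq", "reads2.fq")],
    paired="fb ss 180 250",
)
print(shlex.join(argv))
# To actually run it (requires the program to be installed):
#   import subprocess; subprocess.run(argv, check=True)
```

Passing the list to subprocess.run avoids shell quoting problems with file names that contain spaces.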